W1: Introduction to R

Ania Majewska and Reni Kaul
May 22, 2019

Why is R programming a useful skill?

About today's instructors

Ania Majewska

  • My research interest is in conservation biology and disease ecology
  • I use R in my research to:
    • organize and wrangle data
    • analyze data (statistical models)
    • model disease dynamics (transmission)
  • I mostly work on the monarch butterfly and its protozoan parasite
  • My contact info is majewska@uga.edu

Reni Kaul

  • I'm interested in understanding population extinctions
  • I use R in my research to:
    • organize data
    • develop machine learning algorithms
    • model population dynamics
  • I mostly work with microbial microcosm systems
  • My contact info is reni@uga.edu

Workshop Goals

The workshop series consists of four half-day workshops, each focusing on a different part of the research process:

  • Introduction to R (Today: Ania Majewska and Reni Kaul)
  • Reproducible research (May 29; Deven Vishwas Gokhale and Reni Kaul)
  • Simulating infectious diseases (June 3; John Vinson)
  • Visualizing data (June 7; Robbie Richards)

Workshop Format

  • Introduction of instructor & how they use R
  • Review of past concepts
  • Outline of the day's goals
  • Work through data analysis project
  • Reflection on day's materials

Expectations

You

  • Come ready to participate
    • do the reading
    • review the materials beforehand
  • Ask questions
  • Help each other

Instructors

  • Come prepared
  • Help you find answers
  • Be a resource for you during the REU program

Outline

At the end of this workshop you should be able to…

  • calculate descriptive and inferential statistics of a dataset
  • create a figure from data

Today

Topics

  1. What is R and RStudio?
  2. Introduction to R and tidyverse

    Break

  3. Code along exercise

    Break

  4. Troubleshooting

  5. Exploring data in groups

    Break

  6. Wrap Up

  7. Datacamp.com

1. What is R and RStudio?

  • R is a free, open-source programming language with built-in functionality (base R).
  • Functionality is extended by R packages, which are collections of functions and data sets developed by the community.
  • RStudio is a software program that makes R programming easier:
    • write and test code efficiently
    • organize files into projects
    • integrate other languages (e.g., LaTeX)

RStudio

Let's open RStudio!

2. Introduction to tidyverse

  • The tidyverse is a set of packages that work in harmony because they share a common design.
  • Its syntax uses pipes (%>%) to connect data (objects) to verbs (functions).
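
As a rough illustration of the pipe idea, the sketch below writes the same two-step operation twice: once as nested function calls and once with %>%. It assumes the dplyr package is installed and uses mtcars, a small data set that ships with R, as a stand-in for real data; the object names nested and piped are made up for this example.

```r
library(dplyr)

# mtcars is a built-in data set; it stands in for real data here.

# Without pipes: nested calls are read from the inside out.
nested <- arrange(filter(mtcars, cyl == 4), desc(mpg))

# With pipes: the data flows left to right through each verb.
piped <- mtcars %>%
  filter(cyl == 4) %>%   # keep only the 4-cylinder cars
  arrange(desc(mpg))     # sort from highest to lowest miles per gallon

# Both versions should describe the same result.
identical(nested, piped)
```

Reading the piped version aloud as "take mtcars, then filter, then arrange" is one way to remember what each %>% does.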

tidyverse package

The tidyverse package is designed to make it easy to install and load core packages from the tidyverse in a single command.

Install tidyverse package

  • We need to install the tidyverse package. This can be done by clicking on buttons in RStudio or from the console using install.packages().
  • Install the package by running the line:
# install all the packages in the tidyverse
install.packages("tidyverse") 

Load tidyverse package

  • Once a package is installed, we need to load it during our current R session.
  • This is done using library().
# load tidyverse library
library(tidyverse)
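
Installing is needed once per computer, while loading is needed in every new R session. One possible way to keep the two steps apart is sketched below. The helper needs_install() is hypothetical (it is not part of R or of the workshop files); it relies on requireNamespace(), which reports whether a package is available without attaching it.

```r
# Hypothetical helper: TRUE when a package still has to be installed here.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

if (needs_install("tidyverse")) {
  # Installation is a one-time step, so we only print a reminder.
  message("tidyverse is not installed yet; run install.packages(\"tidyverse\")")
} else {
  # Loading has to be repeated in every new R session.
  library(tidyverse)
}
```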

Download the W1_Exercise zip file

Unzip and open the folder.

Break

Resume in 15 min

Open W1_Exercise.Rproj, then open W1_Exercise.Rmd

Break

Resume in 15 min

4. How to get unstuck: functions

  • Every function has a help page
  • Type ? followed by the function name to open it (e.g., ?mean)
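
A few ways to reach the built-in help from the console are sketched below, using the base R functions mean() and sd() as examples. The exact appearance of the help page depends on whether you run this in RStudio or in a plain R console.

```r
?mean                                # open the help page for mean()
help("mean")                         # the same page, written as a function call
help.search("standard deviation")    # search installed help pages for a phrase
args(sd)                             # show the arguments that sd() accepts
example(mean)                        # run the examples listed on the help page
```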

How to get unstuck: packages

  • Packages have a vignette and/or reference manual on cran.r-project.org

  • People often make their own tutorials too
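
Vignettes that come with an installed package can also be listed from the console. The sketch below assumes dplyr is installed (it is installed together with the tidyverse); the object name dplyr_vignettes is made up for this example.

```r
# List the vignettes (long-form guides) bundled with an installed package.
dplyr_vignettes <- vignette(package = "dplyr")

# Show the name and title of each vignette.
dplyr_vignettes$results[, c("Item", "Title")]

# In an interactive session, open the introductory vignette by name.
if (interactive()) {
  vignette("dplyr", package = "dplyr")
}
```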

  • Cheatsheets published by RStudio
  • Contributed cheatsheets

How to get unstuck: error messages

First, try to understand the error message. It can be very helpful.
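
As a small illustration, the sketch below triggers a very common error on purpose (using an object that was never created) and captures the message instead of stopping. The names undefined_object and msg are made up for this example, and the exact wording of the message can vary with your R version and language settings.

```r
# Using an object that does not exist raises an error.
# tryCatch() lets us capture the message text instead of halting the script.
msg <- tryCatch(
  undefined_object + 1,
  error = function(e) conditionMessage(e)
)

# The message names the missing object, which points at the likely fix:
# check the spelling, or run the line that creates the object first.
print(msg)
```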

But sometimes the message is cryptic. In that case, search for it on Google!

5. Exploring further

Let's work in small groups (max of 3):

  • Develop a question that can be answered with this data
  • Decide which verbs you need and in what order
  • Write the code in the final code chunk of the W1_Exercise.Rmd file
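
The steps above might look like the sketch below. It does not use the workshop data; the built-in mtcars data set stands in for it, and the question ("Do cars with more cylinders get fewer miles per gallon on average?") and the object name mpg_by_cyl are made up for illustration. It assumes dplyr is installed.

```r
library(dplyr)

# Question: do cars with more cylinders get fewer miles per gallon on average?
# Verbs needed, in order: group_by -> summarise -> arrange
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%                 # one group per number of cylinders
  summarise(
    n_cars   = n(),                 # how many cars are in each group
    mean_mpg = mean(mpg),           # average fuel efficiency per group
    sd_mpg   = sd(mpg)              # spread of fuel efficiency per group
  ) %>%
  arrange(desc(mean_mpg))           # most efficient group first

mpg_by_cyl
```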

More instructions in the file.

Break

Resume in 15 min

6. Wrap Up

We can…

  • calculate descriptive and inferential statistics of a dataset
  • create a figure from data

using

  • dplyr package
    • filter, arrange…
  • ggplot2 package
    • geom_point, facet…
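
A compact recap of these pieces is sketched below. It again uses the built-in mtcars data set rather than the workshop data, assumes dplyr and ggplot2 are installed, and uses made-up object names (heavy_cars, mpg_test, mpg_plot). The t-test is only one example of an inferential statistic.

```r
library(dplyr)
library(ggplot2)

# dplyr verbs: keep the heavier cars and sort them from heaviest to lightest.
heavy_cars <- mtcars %>%
  filter(wt > 3) %>%
  arrange(desc(wt))

# Descriptive statistics: mean and standard deviation of miles per gallon.
mtcars %>%
  summarise(mean_mpg = mean(mpg), sd_mpg = sd(mpg))

# Inferential statistics: compare mpg between automatic (am = 0) and
# manual (am = 1) cars with a two-sample t-test from base R.
mpg_test <- t.test(mpg ~ am, data = mtcars)
mpg_test$p.value

# ggplot2: scatter plot of weight against mpg, one panel per cylinder count.
mpg_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl)

print(mpg_plot)
```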

Next week

Focus on communication by following best practices for reproducible research

7. Datacamp.com

We will be using datacamp.com to build more skills!